Inverted Index based Modified Version of KNN for Text Categorization

Similar resources

Inverted Index based Modified Version of KNN for Text Categorization

This research proposes a new strategy in which documents are encoded into string vectors and a modified version of KNN is adapted to string vectors for text categorization. Traditionally, when KNN is used for pattern classification, raw data must be encoded into numerical vectors. This encoding may be difficult, depending on the application area of pattern classification. For example, in...

Inverted Index based Modified Version of K-Means Algorithm for Text Clustering

This research proposes a new strategy in which documents are encoded into string vectors and a modified version of the k-means algorithm is adapted to string vectors for text clustering. Traditionally, when the k-means algorithm is used for pattern classification, raw data must be encoded into numerical vectors. This encoding may be difficult, depending on a given application area of pattern classifi...

Using kNN Model-based Approach for Automatic Text Categorization

An investigation has been conducted on two well-known similarity-based learning approaches to text categorization: the k-nearest neighbor (k-NN) classifier and the Rocchio classifier. After identifying the weaknesses and strengths of each technique, a new classifier called the kNN model-based classifier (kNNModel) has been proposed. It combines the strengths of both k-NN and Rocchio. A text categor...

Svm Based Improvement in Knn for Text Categorization

In today's library science, information and computer science, online text classification or text categorization is a huge complication. [1] With the enormous growth of online information and data, text categorization has become one of the crucial techniques for handling and standardizing text data. Various learning algorithms have been applied on text for categorization. On the basis of ...

Improving kNN Text Categorization by Removing Outliers from Training Set

We show that excluding outliers from the training data significantly improves the kNN classifier, which in this case performs about 10% better than the best known method, the centroid-based classifier. Outliers are the elements whose similarity to the centroid of the corresponding category is below a threshold.
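
The outlier definition above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the sparse term-weight dictionaries, the use of cosine similarity, the mean-vector centroid, and the names `remove_outliers` and `knn_predict` are all assumptions made for illustration, and the threshold value has to be chosen by the user.

```python
import math
from collections import Counter, defaultdict


def cosine(a, b):
    """Cosine similarity between two sparse term-weight dicts (assumed measure)."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def centroid(vectors):
    """Mean of a list of sparse vectors (assumed centroid definition)."""
    total = defaultdict(float)
    for v in vectors:
        for t, w in v.items():
            total[t] += w
    return {t: w / len(vectors) for t, w in total.items()}


def remove_outliers(training, threshold):
    """Keep only documents whose similarity to their own category centroid
    is at least `threshold`. `training` is a list of (vector, label) pairs."""
    by_label = defaultdict(list)
    for vec, label in training:
        by_label[label].append(vec)
    centroids = {label: centroid(vs) for label, vs in by_label.items()}
    return [(vec, label) for vec, label in training
            if cosine(vec, centroids[label]) >= threshold]


def knn_predict(query, training, k=3):
    """Plain majority-vote kNN over the (possibly filtered) training set."""
    ranked = sorted(training, key=lambda item: cosine(query, item[0]),
                    reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]
```

In this sketch the filtering is a preprocessing step: `remove_outliers` is called once on the training set, and `knn_predict` then runs unchanged on the reduced set.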

Journal

Journal title: Journal of Information Processing Systems

Year: 2008

ISSN: 1976-913X

DOI: 10.3745/jips.2008.4.1.017